Static and dynamic lip feature analysis for speaker verification
Abstract
It is well known that speakers have their own talking styles. Hence, lip shape and lip movement can serve as a new biometric for inferring a speaker's identity. Compared with traditional biometrics such as the face and fingerprint, person verification based on lip features has the advantage of containing both static and dynamic information. Many researchers have demonstrated that incorporating dynamic information such as lip movement helps improve verification performance. However, whether the static features or the dynamic features are more discriminative remains an open question. In this paper, the discriminative power of static and dynamic lip features is analyzed. For the static lip features, a new feature representation comprising geometric features, contour descriptors and texture features is proposed, and the Gaussian Mixture Model (GMM) is employed as the classifier. For the dynamic features, the Hidden Markov Model (HMM) is employed as the classifier for its superiority in dealing with time-series data. Experiments are carried out on a database in our lab containing 40 speakers. Various static and dynamic lip feature representations are evaluated in detail, along with a discussion of their discriminative ability. The experimental results show that the dynamic lip shape information and the static lip texture information contain much identity-relevant information.
Index Terms: lip feature, feature analysis, speaker verification
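As a rough illustration of how a GMM classifier can turn static lip features into an accept/reject decision, the sketch below scores feature frames with a log-likelihood ratio between a client model and a background model. It is not the paper's implementation: the two-dimensional features, all model parameters, the use of a background model, the zero threshold, and the function names are assumptions made up for this example, and the models are given directly rather than trained.

```python
import math

def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )

def gmm_log_likelihood(x, gmm):
    """Log-likelihood of x under a GMM given as (weight, mean, var) triples."""
    terms = [math.log(w) + log_gaussian_diag(x, m, v) for w, m, v in gmm]
    peak = max(terms)  # log-sum-exp trick for numerical stability
    return peak + math.log(sum(math.exp(t - peak) for t in terms))

def verify(frames, client_gmm, background_gmm, threshold=0.0):
    """Accept the identity claim if the mean log-likelihood ratio exceeds threshold."""
    llr = sum(
        gmm_log_likelihood(x, client_gmm) - gmm_log_likelihood(x, background_gmm)
        for x in frames
    ) / len(frames)
    return llr, llr > threshold

# Hypothetical toy parameters: 2-D static features (e.g. normalized mouth width, height).
client_gmm = [(0.6, [1.0, 0.5], [0.05, 0.05]), (0.4, [1.2, 0.6], [0.05, 0.05])]
background_gmm = [(0.5, [0.8, 0.4], [0.2, 0.2]), (0.5, [1.4, 0.8], [0.2, 0.2])]

# Made-up test frames: one set near the client model, one set far from it.
genuine_frames = [[1.0, 0.5], [1.15, 0.58]]
impostor_frames = [[1.6, 0.9], [0.6, 0.3]]

genuine_score, genuine_ok = verify(genuine_frames, client_gmm, background_gmm)
impostor_score, impostor_ok = verify(impostor_frames, client_gmm, background_gmm)
print(genuine_ok, impostor_ok)
```

In a real system the mixture parameters would be estimated from enrollment data and the threshold tuned on held-out trials. The dynamic features described in the abstract would need a sequence model such as an HMM instead of frame-independent scoring.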
Similar resources
Variable print quality
In the literature, much research work has been done in the area of speaker verification. The developments include: different types of speaker verification techniques, methods for feature extraction, measures for telephone channel compensation, system robustness etc. In contrast, the problem of acoustic feature selection for speaker verification has been relatively neglected. Hence our aim is to...
A New Lip Feature Representation Method for Video-based Bimodal Authentication
As low-cost video transmission becomes popular, video-based bimodal (audio and visual) authentication has great potential in various applications that require access control. It is especially useful for handheld terminals, which are often used in adverse environments where the signal quality is rather poor. When human voice is used for authentication, one of the most relevant visual fea...
Speaker verification by integrating dynamic and static features using subspace method
In speaker recognition, variation in speech features caused by differences in sentences and recording time is a problem. Speech data includes phonetic information and speaker information. If they are separated from each other, robust speaker verification can be realized by using only the speaker information. However, it is difficult to separate the speaker information from the phonetic information...
Cascading appearance-based features for visual speaker verification
The cascading appearance-based (CAB) feature extraction technique has established itself as the state of the art in extracting dynamic visual speech features for speech recognition. In this paper, we will focus on investigating the effectiveness of this technique for the related speaker verification application. By investigating the speaker verification ability of each stage of the cascade we w...
Multifactor Fusion for Audio-Visual Speaker Recognition
In this paper we propose a multifactor hybrid fusion approach for enhancing security in audio-visual speaker verification. Speaker verification experiments conducted on two audiovisual databases, VidTIMIT and UCBN, show that multifactor hybrid fusion, involving a combination of feature-level fusion of lip-voice features and face-lip-voice features at score level, is indeed a powerful technique for spe...